Overview

Dataset Statistics

Number of Variables 4
Number of Rows 16000
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 1.7 MB
Average Row Size in Memory 113.6 B
Variable Types
  • Numerical: 3
  • Categorical: 1
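
The dataset-level counts above (variables, rows, missing cells, duplicate rows) can be reproduced in a few lines. The following is a minimal sketch using only the Python standard library. The `rows` table and the `dataset_stats` helper are hypothetical illustrations, not the profiler's own implementation.

```python
from collections import Counter

def dataset_stats(rows):
    """Count missing cells and duplicate rows in a list of equal-length tuples."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    n_cells = n_rows * n_cols
    # A cell counts as missing when it is None.
    missing = sum(cell is None for row in rows for cell in row)
    # Every repeat of an already-seen row counts as one duplicate.
    duplicates = sum(count - 1 for count in Counter(rows).values())
    return {
        "variables": n_cols,
        "rows": n_rows,
        "missing_cells": missing,
        "missing_pct": 100.0 * missing / n_cells if n_cells else 0.0,
        "duplicate_rows": duplicates,
        "duplicate_pct": 100.0 * duplicates / n_rows if n_rows else 0.0,
    }

# Hypothetical rows in the report's column order: user_id, item_id, rating, name.
rows = [
    (1, 20, 0, "Naruto"),
    (3, 20, 8, "Naruto"),
    (5, 24, None, "Example Title"),
    (3, 20, 8, "Naruto"),  # exact repeat of the second row
]
stats = dataset_stats(rows)
print(stats)
```

For the dataset profiled here, both the missing-cell and duplicate-row counts are 0, so both percentages are 0.0%.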

Dataset Insights

item_id is skewed
rating is skewed
name has a high cardinality: 3371 distinct values
rating has 2927 (18.29%) zeros
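
The percentages quoted in these insights follow directly from counts reported elsewhere in this report. A minimal check, using only figures from the report (the variable names are illustrative):

```python
n_rows = 16000        # Number of Rows
rating_zeros = 2927   # Zeros reported for rating
name_distinct = 3371  # Approximate Distinct Count reported for name

zero_pct = 100 * rating_zeros / n_rows
distinct_pct = 100 * name_distinct / n_rows

print(f"{zero_pct:.2f}%")      # zeros as a share of all rating values
print(f"{distinct_pct:.1f}%")  # distinct names as a share of all rows
```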

Variables


user_id

numerical

Approximate Distinct Count 12290
Approximate Unique (%) 76.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 256000 B
Mean 36883.7271
Minimum 1
Maximum 73504
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • user_id is nearly symmetric, with a negligible left skew (γ1 = -0.0354)

Quantile Statistics

Minimum 1
5-th Percentile 3943.6
Q1 19284.75
Median 37110.5
Q3 54957.75
95-th Percentile 69219.05
Maximum 73504
Range 73503
IQR 35673
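
Range and IQR are simple differences of the quantiles listed above. Fractional values such as Q1 = 19284.75 are consistent with linear interpolation between neighbouring sorted values, although the report does not state which quantile method was used. The sketch below assumes that method and uses a small hypothetical sample in place of the real column.

```python
import statistics

# Hypothetical sample; the real user_id column has 16000 values.
sample = [1, 2, 3, 4, 5]

# method="inclusive" interpolates linearly between order statistics.
q1, median, q3 = statistics.quantiles(sample, n=4, method="inclusive")
iqr = q3 - q1

# The same subtractions applied to the user_id figures reported above.
reported_range = 73504 - 1          # Maximum - Minimum
reported_iqr = 54957.75 - 19284.75  # Q3 - Q1

print(q1, median, q3, iqr)
print(reported_range, reported_iqr)
```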

Descriptive Statistics

Mean 36883.7271
Standard Deviation 20978.0295
Variance 4.4008e+08
Sum 5.9014e+08
Skewness -0.03536
Kurtosis -1.1991
Coefficient of Variation 0.5688
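
Several of the descriptive statistics are derived from one another: variance is the squared standard deviation, the sum is the mean times the row count, and the coefficient of variation is the standard deviation divided by the mean. A short consistency check on the user_id figures reported above (variable names are illustrative):

```python
n_rows = 16000     # Number of Rows
mean = 36883.7271  # Mean
std = 20978.0295   # Standard Deviation

variance = std ** 2    # reported as 4.4008e+08
total = mean * n_rows  # reported as 5.9014e+08
cv = std / mean        # reported as 0.5688

print(f"{variance:.4e}  {total:.4e}  {cv:.4f}")
```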

item_id

numerical

Approximate Distinct Count 3371
Approximate Unique (%) 21.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 256000 B
Mean 8972.6627
Minimum 1
Maximum 34240
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • item_id is skewed right (γ1 = 0.9854)

Quantile Statistics

Minimum 1
5-th Percentile 121
Q1 1239
Median 6325
Q3 14345
95-th Percentile 28623
Maximum 34240
Range 34239
IQR 13106

Descriptive Statistics

Mean 8972.6627
Standard Deviation 8928.6243
Variance 7.972e+07
Sum 1.4356e+08
Skewness 0.9854
Kurtosis -0.02286
Coefficient of Variation 0.9951
  • item_id is not normally distributed (p-value 1.4099e-18)
  • item_id has 8 outliers
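
The report does not say how the 8 outliers were identified. One common convention, assumed here purely for illustration, is Tukey's rule: flag values more than 1.5 × IQR below Q1 or above Q3. Applied to the item_id quartiles reported above, the upper fence falls just below the maximum of 34240, which is at least compatible with a small number of high outliers.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences of Tukey's k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles and maximum of item_id as reported above.
lower, upper = tukey_fences(q1=1239, q3=14345)
maximum = 34240

print(lower, upper)     # fences implied by the assumed rule
print(maximum > upper)  # True means at least one value lies above the upper fence
```

Under the same assumed rule, the rating quartiles (Q1 = 5, Q3 = 9) give fences of -1 and 15, which no value in the 0 to 10 range can cross.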

rating

numerical

Approximate Distinct Count 11
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 256000 B
Mean 6.376
Minimum 0
Maximum 10
Zeros 2927
Zeros (%) 18.3%
Negatives 0
Negatives (%) 0.0%
  • rating is skewed left (γ1 = -1.0446)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 5
Median 7
Q3 9
95-th Percentile 10
Maximum 10
Range 10
IQR 4

Descriptive Statistics

Mean 6.376
Standard Deviation 3.3363
Variance 11.1306
Sum 102016
Skewness -1.0446
Kurtosis -0.2811
Coefficient of Variation 0.5233
  • rating is not normally distributed (p-value 4.1124e-10)
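
For reference, γ1 is conventionally the third standardized moment. A negative kurtosis such as -0.2811 can only arise for excess kurtosis (fourth standardized moment minus 3). The sketch below uses the population (biased) moment formulas as an assumption, since the report does not state which estimator it uses. The sample ratings are hypothetical.

```python
def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** order for x in values) / len(values)

def skewness(values):
    """Third standardized moment (gamma_1)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Fourth standardized moment minus 3, so a normal distribution scores 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

# Hypothetical ratings with a cluster of zeros and a mass near the top of the scale.
ratings = [0, 0, 5, 7, 8, 9, 10, 10]

print(skewness(ratings))         # negative: the long tail points toward low values
print(excess_kurtosis(ratings))
```

A negative γ1, as reported for rating, means the tail extends toward low values, which matches the large share of zeros.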

name

categorical

Approximate Distinct Count 3371
Approximate Unique (%) 21.1%
Missing 0
Missing (%) 0.0%
Memory Size 1449044 B

Length

Mean 23.0792
Standard Deviation 14.3642
Median 19
Minimum 1
Maximum 98

Sample

1st row Naruto
2nd row Naruto
3rd row Naruto
4th row Naruto
5th row Naruto

Letter

Count 310026
Lowercase Letter 258145
Space Separator 43932
Uppercase Letter 51881
Dash Punctuation 1810
Decimal Number 3033
  • name contains many words: 4670 words
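
The labels in the Letter block match Unicode general categories (Ll, Lu, Zs, Pd, Nd), and the reported letter count equals the lowercase and uppercase counts combined (258145 + 51881 = 310026). A minimal sketch of such a tally using the standard `unicodedata` module follows. The `category_counts` helper and the second example title are hypothetical, and the profiler's own counting logic may differ.

```python
import unicodedata
from collections import Counter

# Unicode general-category codes for the labels used in the report.
LABELS = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
}

def category_counts(names):
    """Tally characters across all strings by Unicode general category."""
    counts = Counter()
    for name in names:
        for ch in name:
            code = unicodedata.category(ch)
            counts[LABELS.get(code, code)] += 1
    return counts

# "Naruto" comes from the sample rows; the second title is hypothetical.
counts = category_counts(["Naruto", "Example Title-2 b"])
letters = counts["Lowercase Letter"] + counts["Uppercase Letter"]

print(dict(counts))
print(letters)
```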
